Effective Fine-Tuning of Vision Transformers with Low-Rank Adaptation for Privacy-Preserving Image Classification

Lin, Haiwei, Imaizumi, Shoko, Kiya, Hitoshi

arXiv.org Artificial Intelligence

We propose a low-rank adaptation method for training privacy-preserving vision transformer (ViT) models that efficiently freezes the pre-trained ViT model weights. In the proposed method, trainable rank decomposition matrices are injected into each layer of the ViT architecture, and, unlike in conventional low-rank adaptation methods, the patch embedding layer is not frozen. The proposed method allows us not only to reduce the number of trainable parameters but also to maintain almost the same accuracy as that of full fine-tuning.

The importance of vision transformer (ViT)-based models [1] has been increasing in recent years. ViT-based models can be applied to vision-language tasks [2] in addition to image classification, object detection [3], and semantic segmentation tasks [4].
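To make the adaptation scheme above concrete, the following is a minimal PyTorch sketch of how trainable rank decomposition matrices can be attached to a frozen ViT backbone while the patch embedding layer is left trainable. The names LoRALinear, add_lora, rank, alpha, and patch_embed_key are illustrative assumptions rather than identifiers from the paper, and the choice to wrap every linear layer is likewise an assumption about which layers receive the low-rank update.

import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """A frozen nn.Linear augmented with trainable rank-decomposition matrices A and B."""

    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # keep the pre-trained weight frozen
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # frozen path W x plus the low-rank update (B A) x
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)


def add_lora(model: nn.Module, rank: int = 4, patch_embed_key: str = "patch_embed"):
    """Freezes the backbone, wraps its linear layers with LoRALinear, and keeps
    the patch embedding trainable (layer naming follows common ViT implementations
    and is an assumption)."""
    for p in model.parameters():
        p.requires_grad = False

    # collect targets first so the module tree is not mutated while iterating
    targets = [(n, m) for n, m in model.named_modules()
               if isinstance(m, nn.Linear) and n and patch_embed_key not in n]
    for name, module in targets:
        parent_name, _, child_name = name.rpartition(".")
        parent = model.get_submodule(parent_name) if parent_name else model
        setattr(parent, child_name, LoRALinear(module, rank=rank))

    # unlike conventional LoRA, leave the patch embedding layer trainable
    for n, p in model.named_parameters():
        if patch_embed_key in n:
            p.requires_grad = True
    return model


# Usage sketch, e.g. with a timm ViT:
#   model = add_lora(timm.create_model("vit_base_patch16_224", pretrained=True))
#   trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)

Under this setup only the injected rank-decomposition matrices and the patch embedding receive gradients, which is what reduces the number of trainable parameters relative to full fine-tuning.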